Training Generative Adversarial Networks by Solving Ordinary Differential Equations
The instability of Generative Adversarial Network (GAN) training has frequently been attributed to gradient descent. Consequently, recent methods have aimed to tailor the models and training procedures to stabilise the discrete updates. In contrast, we study the continuous-time dynamics induced by GAN training. Both theory and toy experiments suggest that these dynamics are in fact surprisingly stable. From this perspective, we hypothesise that instabilities in training GANs arise from the integration error in discretising the continuous dynamics. We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training when combined with a regulariser that controls the integration error. Our approach represents a radical departure from previous methods which typically use adaptive optimisation and stabilisation techniques that constrain the functional space (e.g. Lipschitz-bounded functions).
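The continuous-time view above can be made concrete on the standard Dirac-GAN toy game, where the joint generator/discriminator dynamics reduce to a rotation around the equilibrium. A minimal sketch (my own illustration, not the paper's implementation) of integrating that vector field with a classical fourth-order Runge-Kutta step:

```python
import numpy as np

def v(z):
    # Continuous-time dynamics of the toy bilinear game L(theta, phi) = theta * phi:
    # the generator descends in theta, the discriminator ascends in phi.
    theta, phi = z
    return np.array([-phi, theta])

def rk4_step(f, z, h):
    # One classical fourth-order Runge-Kutta step of the joint dynamics.
    k1 = f(z)
    k2 = f(z + 0.5 * h * k1)
    k3 = f(z + 0.5 * h * k2)
    k4 = f(z + h * k3)
    return z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

z = np.array([1.0, 1.0])
for _ in range(1000):
    z = rk4_step(v, z, h=0.1)

# The exact dynamics orbit the equilibrium (0, 0) at constant radius;
# RK4 keeps the norm very close to its initial value sqrt(2).
print(np.linalg.norm(z))  # ≈ 1.4142
```

The exact flow of this system preserves the distance to the equilibrium, so how far the integrator drifts from that circle is a direct read-out of its integration error.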
Review for NeurIPS paper: Training Generative Adversarial Networks with Limited Data
Summary and Contributions: This work proposes to address the problem of limited data in GAN training with discriminator augmentation (DA), a technique which enables most standard data augmentation techniques to be applied to GANs without leaking them into the learned distribution. The method is simple, yet effective: non-leaking differentiable transformations are applied to real and fake images before being passed through the discriminator, both during discriminator and generator updates. To make transformations non-leaking, it is proposed to apply them with some probability p < 1, such that the discriminator will eventually be able to discern the true underlying distribution. One challenge introduced with this technique is that different datasets require different amounts of augmentation depending on their size, and as such, expensive grid search is required for optimization. To eliminate the need for this search step, an adaptive version called adaptive discriminator augmentation (ADA) is introduced.
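The mechanism the review describes can be sketched in a few lines. This is a hedged illustration only: it uses a single horizontal flip with probability p as a stand-in for the paper's full pipeline of differentiable transformations, and a trivial toy discriminator.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(images, p):
    # Stochastic augmentation: each image is transformed with probability p
    # (a horizontal flip here stands in for the paper's larger set of
    # differentiable transformations).
    mask = rng.random(len(images)) < p
    out = images.copy()
    out[mask] = out[mask][..., ::-1]
    return out

def d_loss(discriminator, reals, fakes, p):
    # Both real and fake batches pass through the same augmentation before
    # the discriminator, during D updates and G updates alike.
    return (np.log(1 + np.exp(-discriminator(augment(reals, p))))
            + np.log(1 + np.exp(discriminator(augment(fakes, p))))).mean()

# Toy discriminator: mean pixel intensity as the logit (illustration only).
disc = lambda x: x.mean(axis=(1, 2))
reals = rng.random((8, 4, 4)); fakes = rng.random((8, 4, 4))
print(d_loss(disc, reals, fakes, p=0.5))
```

Because the augmentation is applied symmetrically to reals and fakes, and with p < 1, the generator is still guided toward the un-augmented data distribution rather than the augmented one.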
Review for NeurIPS paper: Training Generative Adversarial Networks with Limited Data
All reviewers found this work interesting and addressing an important issue in GAN training. The authors did a great job in presenting their analyses and experiments. Please take the reviewers' comments into account in your next revision (particularly some of the presentation advice). The authors are encouraged to cite the following work for a similar "non-leaking" DA: https://arxiv.org/abs/2006.05338 (We did not bring this up during discussion, nor did we use it for or against the authors.)
Review for NeurIPS paper: Training Generative Adversarial Networks by Solving Ordinary Differential Equations
Weaknesses: The hypothesis as it stands now is somewhat under-substantiated. Concretely: from a numerical analysis point of view, truncation error order and long-time convergence of the discrete sequence from numerical differencing are separate concepts. RK methods have higher-order convergence on fixed time intervals, but their domain of absolute stability is not fundamentally different from that of forward Euler. All explicit methods suffer from limited stability, especially for stiff or conservative systems. Figure 1 shows this effect, but if one takes smaller step sizes the Euler method will also converge.
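The reviewer's distinction is easy to demonstrate numerically. A small sketch (my illustration, not taken from the paper) on the conservative rotation x' = -y, y' = x: at a moderate step size forward Euler spirals outward while RK4 stays near the circle, yet on a fixed interval Euler also converges as the step size shrinks.

```python
import numpy as np

f = lambda z: np.array([-z[1], z[0]])  # conservative rotation: trajectories are circles

def integrate(step, h, T):
    z = np.array([1.0, 0.0])
    for _ in range(int(T / h)):
        z = step(z, h)
    return np.linalg.norm(z)  # the exact answer is 1 for every T

euler = lambda z, h: z + h * f(z)  # forward Euler: grows the norm by sqrt(1 + h^2) per step

def rk4(z, h):
    k1 = f(z); k2 = f(z + h/2 * k1); k3 = f(z + h/2 * k2); k4 = f(z + h * k3)
    return z + h/6 * (k1 + 2*k2 + 2*k3 + k4)

print(integrate(euler, 0.1, 50.0))    # ≈ 12, spiralling outward
print(integrate(rk4,   0.1, 50.0))    # ≈ 1.0
# But on the fixed interval [0, T], Euler converges as h -> 0:
print(integrate(euler, 0.001, 50.0))  # ≈ 1.025
```

This supports both halves of the remark: the higher-order method has far smaller error at a fixed step size, but the lower-order method is not divergent in the limit of small steps on a fixed interval.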
Review for NeurIPS paper: Training Generative Adversarial Networks by Solving Ordinary Differential Equations
The paper introduces a new perspective for explaining instability in GAN training by analyzing the continuous dynamics of the training algorithm. The authors first show that these dynamics should converge in the vicinity of the Nash equilibrium, and then hypothesize that instability is due to the discretization of these dynamics. They show that using higher-order ODE time integrators to solve the dynamics helps stabilize training. The paper is clear, the reviewers agree that this brings a new perspective for analyzing and training GANs, and that this is a significant contribution to this topic. The theoretical findings are backed up by a nice empirical evaluation and analysis.
Training Generative Adversarial Networks with Limited Data
Training generative adversarial networks (GAN) using too little data typically leads to discriminator overfitting, causing training to diverge. We propose an adaptive discriminator augmentation mechanism that significantly stabilizes training in limited data regimes. The approach does not require changes to loss functions or network architectures, and is applicable both when training from scratch and when fine-tuning an existing GAN on another dataset. We demonstrate, on several datasets, that good results are now possible using only a few thousand training images, often matching StyleGAN2 results with an order of magnitude fewer images. We expect this to open up new application domains for GANs.
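The adaptive mechanism behind this abstract can be sketched as a simple feedback controller. In the paper, an overfitting heuristic r_t = E[sign(D(real))] is tracked and the augmentation probability p is nudged toward a target; the target value and adjustment speed below are illustrative defaults, not the paper's tuned settings.

```python
import numpy as np

def ada_update(p, d_real_logits, target=0.6, speed=0.01):
    # Adaptive discriminator augmentation (sketch): r_t = E[sign(D(real))]
    # measures discriminator overfitting. p is nudged up when r_t exceeds
    # the target and down otherwise, clipped to [0, 1]. The target and
    # speed here are illustrative assumptions.
    r_t = np.mean(np.sign(d_real_logits))
    p = p + speed * (1.0 if r_t > target else -1.0)
    return float(np.clip(p, 0.0, 1.0))

# An overfitting discriminator (confidently positive on all reals) drives p up:
p = 0.0
for _ in range(10):
    p = ada_update(p, d_real_logits=np.array([2.0, 3.0, 1.5, 4.0]))
print(p)  # ≈ 0.1
```

The appeal of this design is that p rises only as fast as overfitting is actually observed, so small datasets automatically receive stronger augmentation than large ones without any grid search.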
Training Generative Adversarial Networks with Adaptive Composite Gradient
Huiqing Qi, Fang Li, Shengli Tan, Xiangyun Zhang
The wide applications of Generative Adversarial Networks benefit from successful training methods, which guarantee that an objective function converges to a local minimum. Nevertheless, designing an efficient and competitive training method remains challenging, due to the cyclic behaviors of some gradient-based methods and the expensive computational cost of methods based on the Hessian matrix. This paper proposes the Adaptive Composite Gradients (ACG) method, which is linearly convergent in bilinear games under suitable settings. Theory and toy-function experiments suggest that our approach can alleviate the cyclic behaviors and converge faster than recently proposed algorithms. Notably, the ACG method can find stable fixed points not only in bilinear games but also in general games. ACG is a novel semi-gradient-free algorithm: it does not need to calculate the gradient at every step, reducing the computational cost of gradient and Hessian evaluations by utilizing predictive information from future iterations. We conducted two mixture-of-Gaussians experiments by integrating ACG into existing algorithms with linear GANs. Results show ACG is competitive with the previous algorithms. Realistic experiments on four prevalent datasets (MNIST, Fashion-MNIST, CIFAR-10, and CelebA) with DCGANs show that our ACG method outperforms several baselines, which illustrates the superiority and efficacy of our method.
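The cyclic behavior this abstract refers to is easy to reproduce on the canonical bilinear game min_x max_y xy. The sketch below (my own illustration) contrasts plain simultaneous gradient descent-ascent, which cycles and slowly diverges, with an extragradient update, a well-known example of the "predictive information" family of fixes; it is a stand-in for the idea, not an implementation of ACG itself.

```python
import numpy as np

# Bilinear game min_x max_y f(x, y) = x * y; the unique equilibrium is (0, 0).
grad = lambda z: np.array([z[1], -z[0]])  # (df/dx, -df/dy): descend in x, ascend in y

def simultaneous(z, lr):
    # Plain simultaneous gradient descent-ascent: rotates around (0, 0)
    # and slowly spirals outward.
    return z - lr * grad(z)

def extragradient(z, lr):
    # A predictive (extragradient) update: look ahead one step, then apply
    # the gradient evaluated at the predicted point. This illustrates the
    # use of predictive information, not the ACG algorithm itself.
    z_pred = z - lr * grad(z)
    return z - lr * grad(z_pred)

for step, name in [(simultaneous, "GDA"), (extragradient, "EG ")]:
    z = np.array([1.0, 1.0])
    for _ in range(500):
        z = step(z, lr=0.1)
    print(name, np.linalg.norm(z))  # GDA grows, EG shrinks toward 0
```

The lookahead evaluation is what breaks the cycle: the gradient at the predicted point carries a component pointing toward the equilibrium that the plain update lacks.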
Keep Calm and train a GAN. Pitfalls and Tips on training Generative Adversarial Networks
Generative Adversarial Networks (GANs) are among the hottest topics in Deep Learning right now. There has been a tremendous increase in the number of papers published on GANs over the last several months. GANs have been applied to a great variety of problems, and in case you missed the train, here is a list of some cool applications of GANs. Now, I had read a lot about GANs, but never played with one myself. So, after going through some inspiring papers and GitHub repos, I decided to try my hand at training a simple GAN myself, and I immediately ran into problems.